A Novel Approach to Model Generation for Heterogeneous Data Classification
نویسندگان
چکیده
Ensemble methods such as bagging and boosting have been successfully applied to classification problems. Two important issues associated with an ensemble approach are: how to generate models to construct an ensemble, and how to combine them for classification. In this paper, we focus on the problem of model generation for heterogeneous data classification. If we could partition heterogeneous data into a number of homogeneous partitions , we will likely generate reliable and accurate classification models over the homogeneous partitions. We examine different ways of forming homogeneous subsets and propose a novel method that allows a data point to be assigned multiple times in order to generate homogeneous partitions for ensemble learning. We present the details of the new algorithm and empirical studies over the UCI benchmark datasets and datasets of image classification , and show that the proposed approach is effective for heterogeneous data classification.
منابع مشابه
A Novel Combinatorial Approach to Discrete Fracture Network Modeling in Heterogeneous Media
Fractured reservoirs contain about 85 and 90 percent of oil and gas resources respectively in Iran. A comprehensive study and investigation of fractures as the main factor affecting fluid flow or perhaps barrier seems necessary for reservoir development studies. High degrees of heterogeneity and sparseness of data have incapacitated conventional deterministic methods in fracture network modelin...
متن کاملA committee machine approach for predicting permeability from well log data: a case study from a heterogeneous carbonate reservoir, Balal oil Field, Persian Gulf
Permeability prediction problem has been examined using several methods such as empirical formulas, regression analysis and intelligent systems especially neural networks and fuzzy logic. This study proposes an improved and novel model for predicting permeability from conventional well log data. The methodology is integration of empirical formulas, multiple regression and neuro-fuzzy in a commi...
متن کاملA Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset
Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of the electrocardiogram signal classification and segmentation methodologies indicates that algorithms which are able to effectively handle the nonstationary and uncertainty of the signals should be used for ECG analysis. Ex...
متن کاملModelling the Heterogeneous Price Setting Behavior of Firms (DSGE Approach)
Despite the empirical evidence of the difference in the degree of price stickiness of goods and services, in the new standard Keynesian models, the same price stickiness is considered for all firms producing intermediate goods. In recent years, a new generation of pricing models has been introduced to simulate the heterogeneous price setting behavior in which, unlike standard pricing models, th...
متن کاملPalarimetric Synthetic Aperture Radar Image Classification using Bag of Visual Words Algorithm
Land cover is defined as the physical material of the surface of the earth, including different vegetation covers, bare soil, water surface, various urban areas, etc. Land cover and its changes are very important and influential on the Earth and life of living organisms, especially human beings. Land cover change monitoring is important for protecting the ecosystem, forests, farmland, open spac...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005